Towards a mixed approach to extract biomedical terms from documents

Authors

  • Juan Antonio Lossio Ventura
  • Clement Jonquet
  • Mathieu Roche
  • Maguelonne Teisseire
Abstract

This work aims to automatically extract biomedical terms from free text. We present new extraction methods that take into account linguistic patterns specialized for the biomedical field, statistical term extraction measures such as C-value, and statistical keyword extraction measures such as Okapi BM25 and TF-IDF. These measures are combined to improve the extraction process, and we investigate which combinations are most relevant in different contexts. Experimental results show that an appropriate harmonic mean of C-value and keyword extraction measures offers better precision for both single-word and multi-word term extraction. The experiments cover the extraction of English and French biomedical terms from a corpus of laboratory tests available online. The results are validated using UMLS as the reference for English and MeSH only for French.
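As a rough illustration of combining a term extraction measure with a keyword extraction measure through a harmonic mean, the following Python sketch may help. It is not the authors' implementation: the abstract does not give the exact formula, so the min-max normalization, the function names (`min_max_normalize`, `harmonic_mean`, `combine_scores`), and the toy scores are all assumptions made for this example.

```python
# Minimal sketch: rank candidate terms by the harmonic mean of two scores.
# Assumption: both scores are min-max normalized to [0, 1] before combining;
# the abstract does not specify how the measures are scaled.

def min_max_normalize(scores):
    """Rescale a dict of scores to [0, 1]; a constant dict maps to all zeros."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {term: 0.0 for term in scores}
    return {term: (value - lo) / (hi - lo) for term, value in scores.items()}


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative numbers; 0.0 when both are zero."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


def combine_scores(c_values, keyword_scores):
    """Return (term, combined score) pairs sorted from best to worst."""
    c_norm = min_max_normalize(c_values)
    k_norm = min_max_normalize(keyword_scores)
    shared_terms = c_norm.keys() & k_norm.keys()
    combined = {t: harmonic_mean(c_norm[t], k_norm[t]) for t in shared_terms}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three candidate terms (invented for illustration).
c_values = {"blood glucose": 4.0, "glucose": 2.0, "test": 1.0}
tfidf_scores = {"blood glucose": 0.9, "glucose": 0.6, "test": 0.1}

ranked = combine_scores(c_values, tfidf_scores)
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

A harmonic mean rewards terms that score well on both measures at once: a candidate that is strong on one measure but near zero on the other ends up with a low combined score.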

Similar resources

Towards a Mixed Approach to Extract Biomedical Terms from Text Corpus

The objective of this paper is to present a methodology to automatically extract and rank biomedical terms from free text. The authors present new extraction methods that take into account linguistic patterns specialized for the biomedical domain, statistical term extraction measures such as C-value, and statistical keyword extraction measures such as Okapi BM25 and TF-IDF. These measures are combined ...

Explaining Competency model of teacher

Purpose: The concept of teacher competency is one of the key concepts in the document of the fundamental transformation of education. Different researchers have presented three approaches in various terms for modeling competency such as adaptive approach, adaptive-design approach, and design approach. Top-down and bottom-up approach is another division of competency modeling. Methodology: The p...

Towards a context sensitive approach to searching information based on domain specific knowledge sources

In the context of document retrieval in the biomedical domain, this paper introduces a novel approach to searching for biomedical information using contextual semantic information. More specifically, we propose to combine the contextual semantic information in documents and user queries in an attempt to improve the performance of biomedical information retrieval (IR) systems. Contextual informa...

Extracting Conceptual Terms from Medical Documents

Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in the biomedical domain. Step one is noun phrase extraction, which automatically extracts noun phrases from medical documents. Extracted noun phrases are used as concept term candidates which beco...

A Maximum-Entropy approach for accurate document annotation in the biomedical domain

The growing volume of scientific literature on the Web and the absence of efficient tools for classifying and searching documents are the two most important factors that influence the speed of the search and the quality of the results. Previous studies have shown that the use of ontologies makes it possible to process document and query information at the semantic level, which gre...

Publication date: 2013